Multi-grid Methods for Reinforcement Learning in Controlled Diiusion Processes

نویسنده

  • Stephan Pareigis
چکیده

Reinforcement learning methods for discrete and semi-Markov decision problems such as Real-Time Dynamic Programming can be generalized for Controlled Diiusion Processes. The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic diierential equation of Hamilton-Jacobi-Bellman (HJB-) type. Numerical analysis provides multi-grid methods for this kind of equation. In the case of Learning Control , however, the systems of equations on the various grid-levels are obtained using observed information (transitions and local cost). To ensure consistency, special attention needs to be directed toward the type of time and space discretization during the observation. An algorithm for multi-grid observation is proposed. The multi-grid algorithm is demonstrated on a simple queuing problem.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes

Reinforcement learning methods for discrete and semi-Markov decision problems such as Real-Time Dynamic Programming can be generalized for Controlled Diffusion Processes. The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic differential equation of HamiltonJacobi-Bellman (HJB-) type. Numerical analysis provides multigrid methods for this ki...

متن کامل

Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism

This paper develops an adaptive control method for controlling frequency and voltage of an islanded mini/micro grid (M/µG) using reinforcement learning method. Reinforcement learning (RL) is one of the branches of the machine learning, which is the main solution method of Markov decision process (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...

متن کامل

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi agent case, have long been used for modeling multi agent system and are used as a suitable framework for Multi agent Reinforcement Learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDP is proposed. In the proposed algorithm, MMDP ...

متن کامل

Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach

This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...

متن کامل

Using Multi-Agent Options to Reduce Learning Time in Reinforcement Learning

Distributed multi-agent learning has recently received significant interest but also proven to be very complex as the decisions made by any individual agent are not the only factors in the outcomes of those decisions. Uncertainty associated in the decisions and exploration choices of other agents add complexity and delay to individual learning processes. To address this complexity and provide f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996